Search for: All records

Creators/Authors contains: "Gomez, Jorge"


  1. Low-latency and low-power edge AI is crucial for Virtual Reality and Augmented Reality applications. Recent advances demonstrate that hybrid models, combining convolution layers (CNN) and transformers (ViT), often achieve a superior accuracy/performance tradeoff on various computer vision and machine learning (ML) tasks. However, hybrid ML models can present system challenges for latency and energy efficiency due to their diverse nature in dataflow and memory access patterns. In this work, we leverage architecture heterogeneity from Neural Processing Units (NPU) and Compute-In-Memory (CIM) and explore diverse execution schemas to efficiently execute these hybrid models. We introduce H4H-NAS, a two-stage Neural Architecture Search (NAS) framework to automate the design of efficient hybrid CNN/ViT models for heterogeneous edge systems featuring both NPU and CIM. We propose a two-phase incremental supernet training in our NAS framework to resolve gradient conflicts between sampled subnets caused by different types of blocks in a hybrid model search space. Our H4H-NAS approach is also powered by a performance estimator built with NPU performance results measured on real silicon, and CIM performance based on industry IPs. H4H-NAS searches hybrid CNN-ViT models with fine granularity and achieves significant (up to 1.34%) top-1 accuracy improvement on ImageNet. Moreover, results from our algorithm/hardware co-design reveal up to 56.08% overall latency and 41.72% energy improvements by introducing heterogeneous computing over baseline solutions. Overall, our framework guides the design of hybrid network architectures and system architectures for NPU+CIM heterogeneous systems. 
    Free, publicly-accessible full text available January 20, 2026
  2. null (Ed.)
  3. The striking similarity between biological locomotion gaits and the evolution of phase patterns in coupled oscillatory networks can be traced to the role of the central pattern generator located in the spinal cord. Bio-inspired robotics aims to harness this control approach to generate rhythmic patterns for synchronized limb movement. Here, we utilize synchronization and the spatiotemporal patterns that emerge from the interaction among coupled oscillators to generate a range of locomotion gait patterns. We experimentally demonstrate a central pattern generator network using capacitively coupled vanadium dioxide nano-oscillators. The coupled oscillators exhibit stable limit-cycle oscillations and tunable natural frequencies for real-time programmability of the phase pattern. The ultra-compact 1 Transistor-1 Resistor implementation of the oscillator and the bidirectional capacitive coupling allow a small footprint and low operating power. Compared to biomimetic CMOS-based neuron and synapse models, our design simplifies on-chip implementation and real-time tunability by reducing the number of control parameters.
  4. We report the first experimental demonstration of ferroelectric field-effect transistor (FEFET) based spiking neurons. A unique feature of the ferroelectric (FE) neuron demonstrated herein is the availability of both excitatory and inhibitory input connections in the compact 1T-1FEFET structure, which is also reported for the first time for any neuron implementation. Such dual neuron functionality is a key requirement for bio-mimetic neural networks and represents a breakthrough for implementation of third-generation spiking neural networks (SNNs), also reported herein for unsupervised learning and clustering on real-world data for the first time. The key to our demonstration is the careful design of two important device-level features: (1) abrupt hysteretic transitions of the FEFET with no stable states therein, and (2) the dynamic tunability of the FEFET hysteresis by bias conditions, which allows for the inhibition functionality. Experimentally calibrated, multi-domain Preisach-based FEFET models were used to accurately simulate the FE neurons and project their performance at scaled nodes. We also implement an SNN for unsupervised clustering, benchmark the network performance across analog CMOS and emerging technologies, and observe (1) unification of excitatory and inhibitory neural connections, (2) STDP-based learning, (3) lowest reported power (3.6 nW) during classification, and (4) a classification accuracy of 93%.
  5. The current rate of data generation and the need for real‐time data analytics can benefit from new computational approaches in which computation proceeds in a massively parallel way while remaining scalable and energy efficient. Biological systems arising from the interaction of living cells can provide such pathways for sustainable computing. Current designs for biocomputing that leverage the information processing units of cells, such as DNA, gene, or protein circuitries, are inherently slow (operating on timescales of hours to days) and are therefore primarily being considered for archival storage of information. In contrast, electrically active cells that can synchronize in milliseconds and can be connected as networks to perform massively parallel tasks can transform biocomputing and lead to novel ways of high‐throughput information processing. Herein, coupled oscillator networks made of living cardiac muscle cells, or bio‐oscillators, are explored as collective computing components for solving computationally hard problems. An empirically validated, circuit‐compatible macromodel is developed for the bio‐oscillators and for the fibroblast cells acting as coupling elements, to faithfully reproduce the synchronization dynamics of the network. It is shown that such a bio‐oscillator network can be scaled up to hundreds of nodes and used to solve computationally hard problems faster than traditional heuristics‐based Boolean algorithms.